Goto

Collaborating Authors

 Sharjah


Chiplet-Based RISC-V SoC with Modular AI Acceleration

Bharadwaj, Suhas Suresh, Ramkumar, Prerana

arXiv.org Artificial Intelligence

Achieving high performance, energy efficiency, and cost-effectiveness while maintaining architectural flexibility is a critical challenge in the development and deployment of edge AI devices. Monolithic SoC designs struggle with this complex balance mainly due to low manufacturing yields (below 16%) at advanced 360 mm^2 process nodes. This paper presents a novel chiplet-based RISC-V SoC architecture that addresses these limitations through modular AI acceleration and intelligent system level optimization. Our proposed design integrates 4 different key innovations in a 30mm x 30mm silicon interposer: adaptive cross-chiplet Dynamic Voltage and Frequency Scaling (DVFS); AI-aware Universal Chiplet Interconnect Express (UCIe) protocol extensions featuring streaming flow control units and compression-aware transfers; distributed cryptographic security across heterogeneous chiplets; and intelligent sensor-driven load migration. The proposed architecture integrates a 7nm RISC-V CPU chiplet with dual 5nm AI accelerators (15 TOPS INT8 each), 16GB HBM3 memory stacks, and dedicated power management controllers. Experimental results across industry standard benchmarks like MobileNetV2, ResNet-50 and real-time video processing demonstrate significant performance improvements. The AI-optimized configuration achieves ~14.7% latency reduction, 17.3% throughput improvement, and 16.2% power reduction compared to previous basic chiplet implementations. These improvements collectively translate to a 40.1% efficiency gain corresponding to ~3.5 mJ per MobileNetV2 inference (860 mW/244 images/s), while maintaining sub-5ms real-time capability across all experimented workloads. These performance upgrades demonstrate that modular chiplet designs can achieve near-monolithic computational density while enabling cost efficiency, scalability and upgradeability, crucial for next-generation edge AI device applications.


A Survey of Generative Categories and Techniques in Multimodal Generative Models

Han, Longzhen, Mubarak, Awes, Baimagambetov, Almas, Polatidis, Nikolaos, Baker, Thar

arXiv.org Artificial Intelligence

Multimodal Generative Models (MGMs) have rapidly evolved beyond text generation, now spanning diverse output modalities including images, music, video, human motion, and 3D objects, by integrating language with other sensory modalities under unified architectures. This survey categorises six primary generative modalities and examines how foundational techniques, namely Self-Supervised Learning (SSL), Mixture of Experts (MoE), Reinforcement Learning from Human Feedback (RLHF), and Chain-of-Thought (CoT) prompting, enable cross-modal capabilities. We analyze key models, architectural trends, and emergent cross-modal synergies, while highlighting transferable techniques and unresolved challenges. Building on a common taxonomy of models and training recipes, we propose a unified evaluation framework centred on faithfulness, compositionality, and robustness, and synthesise evidence from benchmarks and human studies across modalities. We further analyse trustworthiness, safety, and ethical risks, including multimodal bias, privacy leakage, and the misuse of high-fidelity media generation for deepfakes, disinformation, and copyright infringement in music and 3D assets, together with emerging mitigation strategies. Finally, we discuss how architectural trends, evaluation protocols, and governance mechanisms can be co-designed to close current capability and safety gaps, outlining critical paths toward more general-purpose, controllable, and accountable multimodal generative systems.


Low-Bit, High-Fidelity: Optimal Transport Quantization for Flow Matching

Varam, Dara, Abuhani, Diaa A., Zualkernan, Imran, AlDamani, Raghad, Khalil, Lujain

arXiv.org Artificial Intelligence

Flow Matching (FM) generative models offer efficient simulation-free training and deterministic sampling, but their practical deployment is challenged by high-precision parameter requirements. We adapt optimal transport (OT)-based post-training quantization to FM models, minimizing the 2-Wasserstein distance between quantized and original weights, and systematically compare its effectiveness against uniform, piecewise, and logarithmic quantization schemes. Our theoretical analysis provides upper bounds on generative degradation under quantization, and empirical results across five benchmark datasets of varying complexity show that OT-based quantization preserves both visual generation quality and latent space stability down to 2-3 bits per parameter, where alternative methods fail. This establishes OT-based quantization as a principled, effective approach to compress FM generative models for edge and embedded AI applications.


Federated Learning with Gramian Angular Fields for Privacy-Preserving ECG Classification on Heterogeneous IoT Devices

Elmir, Youssef, Himeur, Yassine, Amira, Abbes

arXiv.org Artificial Intelligence

This study presents a federated learning (FL) framework for privacy-preserving electrocardiogram (ECG) classification in Internet of Things (IoT) healthcare environments. By transforming 1D ECG signals into 2D Gramian Angular Field (GAF) images, the proposed approach enables efficient feature extraction through Convolutional Neural Networks (CNNs) while ensuring that sensitive medical data remain local to each device. This work is among the first to experimentally validate GAF-based federated ECG classification across heterogeneous IoT devices, quantifying both performance and communication efficiency. To evaluate feasibility in realistic IoT settings, we deployed the framework across a server, a laptop, and a resource-constrained Raspberry Pi 4, reflecting edge-cloud integration in IoT ecosystems. Experimental results demonstrate that the FL-GAF model achieves a high classification accuracy of 95.18% in a multi-client setup, significantly outperforming a single-client baseline in both accuracy and training time. Despite the added computational complexity of GAF transformations, the framework maintains efficient resource utilization and communication overhead. These findings highlight the potential of lightweight, privacy-preserving AI for IoT-based healthcare monitoring, supporting scalable and secure edge deployments in smart health systems.


Image and Point-cloud Classification for Jet Analysis in High-Energy Physics: A survey

Kheddar, Hamza, Himeur, Yassine, Amira, Abbes, Soualah, Rachik

arXiv.org Artificial Intelligence

Nowadays, there has been a growing trend in the field of high-energy physics (HEP), in both its experimental and phenomenological studies, to incorporate machine learning (ML) and its specialized branch, deep learning (DL). This review paper provides a thorough illustration of these applications using different ML and DL approaches. The first part of the paper examines the basics of various particle physics types and establishes guidelines for assessing particle physics alongside the available learning models. Next, a detailed classification is provided for representing Jets that are reconstructed in high-energy collisions, mainly in proton-proton collisions at well-defined beam energies. This section covers various datasets, preprocessing techniques, and feature extraction and selection methods. The presented techniques can be applied to future hadron-hadron colliders (HHC), such as the high-luminosity LHC (HL-LHC) and the future circular collider - hadron-hadron (FCChh). The authors then explore several AI techniques analyses designed specifically for both image and point-cloud (PC) data in HEP. Additionally, a closer look is taken at the classification associated with Jet tagging in hadron collisions. In this review, various state-of-the-art (SOTA) techniques in ML and DL are examined, with a focus on their implications for HEP demands. More precisely, this discussion addresses various applications in extensive detail, such as Jet tagging, Jet tracking, particle classification, and more. The review concludes with an analysis of the current state of HEP using DL methodologies. It highlights the challenges and potential areas for future research, which are illustrated for each application.


Enhancing Communication Efficiency in FL with Adaptive Gradient Quantization and Communication Frequency Optimization

Tariq, Asadullah, Qayyum, Tariq, Serhani, Mohamed Adel, Sallabi, Farag, Taleb, Ikbal, Barka, Ezedin S.

arXiv.org Artificial Intelligence

Federated Learning (FL) enables participant devices to collaboratively train deep learning models without sharing their data with the server or other devices, effectively addressing data privacy and computational concerns. However, FL faces a major bottleneck due to high communication overhead from frequent model updates between devices and the server, limiting deployment in resource-constrained wireless networks. In this paper, we propose a three-fold strategy. Firstly, an Adaptive Feature-Elimination Strategy to drop less important features while retaining high-value ones; secondly, Adaptive Gradient Innovation and Error Sensitivity-Based Quantization, which dynamically adjusts the quantization level for innovative gradient compression; and thirdly, Communication Frequency Optimization to enhance communication efficiency. We evaluated our proposed model's performance through extensive experiments, assessing accuracy, loss, and convergence compared to baseline techniques. The results show that our model achieves high communication efficiency in the framework while maintaining accuracy.


Extracting Actionable Insights from Building Energy Data using Vision LLMs on Wavelet and 3D Recurrence Representations

Bechar, Amine, Oulefki, Adel, Amira, Abbes, Kurogollu, Fatih, Himeur, Yassine

arXiv.org Artificial Intelligence

The analysis of complex building time-series for actionable insights and recommendations remains challenging due to the nonlinear and multi-scale characteristics of energy data. To address this, we propose a framework that fine-tunes visual language large models (VLLMs) on 3D graphical representations of the data. The approach converts 1D time-series into 3D representations using continuous wavelet transforms (CWTs) and recurrence plots (RPs), which capture temporal dynamics and localize frequency anomalies. These 3D encodings enable VLLMs to visually interpret energy-consumption patterns, detect anomalies, and provide recommendations for energy efficiency. We demonstrate the framework on real-world building-energy datasets, where fine-tuned VLLMs successfully monitor building states, identify recurring anomalies, and generate optimization recommendations. Quantitatively, the Idefics-7B VLLM achieves validation losses of 0.0952 with CWTs and 0.1064 with RPs on the University of Sharjah energy dataset, outperforming direct fine-tuning on raw time-series data (0.1176) for anomaly detection. This work bridges time-series analysis and visualization, providing a scalable and interpretable framework for energy analytics.


Quantum-based QoE Optimization in Advanced Cellular Networks: Integration and Cloud Gaming Use Case

Chaouech, Fatma, Villegas, Javier, Pereira, António, Baena, Carlos, Fortes, Sergio, Barco, Raquel, Gribben, Dominic, Dib, Mohammad, Villarino, Alba, Cortines, Aser, Orús, Román

arXiv.org Artificial Intelligence

This work explores the integration of Quantum Machine Learning (QML) and Quantum-Inspired (QI) techniques for optimizing end-to-end (E2E) network services in telecommunication systems, particularly focusing on 5G networks and beyond. The application of QML and QI algorithms is investigated, comparing their performance with classical Machine Learning (ML) approaches. The present study employs a hybrid framework combining quantum and classical computing leveraging the strengths of QML and QI, without the penalty of quantum hardware availability. This is particularized for the optimization of the Quality of Experience (QoE) over cellular networks. The framework comprises an estimator for obtaining the expected QoE based on user metrics, service settings, and cell configuration, and an optimizer that uses the estimation to choose the best cell and service configuration. Although the approach is applicable to any QoE-based network management, its implementation is particularized for the optimization of network configurations for Cloud Gaming services. Then, it is evaluated via performance metrics such as accuracy and model loading and inference times for the estimator, and time to solution and solution score for the optimizer. The results indicate that QML models achieve similar or superior accuracy to classical ML models for estimation, while decreasing inference and loading times. Furthermore, potential for better performance is observed for higher-dimensional data, highlighting promising results for higher complexity problems. Thus, the results demonstrate the promising potential of QML in advancing network optimization, although challenges related to data availability and integration complexities between quantum and classical ML are identified as future research lines.


Unsupervised Urban Tree Biodiversity Mapping from Street-Level Imagery Using Spatially-Aware Visual Clustering

Abuhani, Diaa Addeen, Seccaroni, Marco, Mazzarello, Martina, Zualkernan, Imran, Duarte, Fabio, Ratti, Carlo

arXiv.org Artificial Intelligence

Urban tree biodiversity is critical for climate resilience, ecological stability, and livability in cities, yet most municipalities lack detailed knowledge of their canopies. Field-based inventories provide reliable estimates of Shannon and Simpson diversity but are costly and time-consuming, while supervised AI methods require labeled data that often fail to generalize across regions. We introduce an unsupervised clustering framework that integrates visual embeddings from street-level imagery with spatial planting patterns to estimate biodiversity without labels. Applied to eight North American cities, the method recovers genus-level diversity patterns with high fidelity, achieving low Wasserstein distances to ground truth for Shannon and Simpson indices and preserving spatial autocorrelation. This scalable, fine-grained approach enables biodiversity mapping in cities lacking detailed inventories and offers a pathway for continuous, low-cost monitoring to support equitable access to greenery and adaptive management of urban ecosystems.


A Cost-Effective Framework for Predicting Parking Availability Using Geospatial Data and Machine Learning

Bagosher, Madyan, Mustafa, Tala, Alsmirat, Mohammad, Al-Ali, Amal, Jawarneh, Isam Mashhour Al

arXiv.org Artificial Intelligence

As urban populations continue to grow, cities face numerous challenges in managing parking and determining occupancy. This issue is particularly pronounced in university campuses, where students need to find vacant parking spots quickly and conveniently during class timings. The limited availability of parking spaces on campuses underscores the necessity of implementing efficient systems to allocate vacant parking spots effectively. We propose a smart framework that integrates multiple data sources, including street maps, mobility, and meteorological data, through a spatial join operation to capture parking behavior and vehicle movement patterns over the span of 3 consecutive days with an hourly duration between 7AM till 3PM. The system will not require any sensing tools to be installed in the street or in the parking area to provide its services since all the data needed will be collected using location services. The framework will use the expected parking entrance and time to specify a suitable parking area. Several forecasting models, namely, Linear Regression, Support Vector Regression (SVR), Random Forest Regression (RFR), and Long Short-Term Memory (LSTM), are evaluated. Hyperparameter tuning was employed using grid search, and model performance is assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Coefficient of Determination (R2). Random Forest Regression achieved the lowest RMSE of 0.142 and highest R2 of 0.582. However, given the time-series nature of the task, an LSTM model may perform better with additional data and longer timesteps.